New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Lambda.subst: also update debug event environments #1751
Conversation
Very naive question: couldn't |
As I said:
But to answer more precisely: I'm not sure |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What the theory tells us is that a substitution by an environment env
goes from an enviromnent E1
to an environment E2
if, for all x \in E1
, the term env(x)
is well scoped in E2
. Then, for any term t
that is well-scoped is E1
, subst env t
is well-scoped in E2
. The variable-renaming cases corresponds to the case where env(x)
is also a variable, and then has to belong to E2
. (In the actual implementation, we only give a fragment of the environment with the notion that variables not in the fragment are unchanged, so we rather go from E,E1
to E,E2
for a common, untouched environment E
).
A general way to make subst
do the right thing on environments would be to add a parameter of type variable -> value_description -> env -> env
, which for a given variable x : vd
and result environment E2
, extends E2
to contain the environment necessary for env(x)
. In particular, for a variable->variable renaming the function would just add the renamed binder to the input environment, and for substitutions that turn a variable into a closed term the function would return the input environment unchanged.
In the implementation of Lambda.subst
, an auxiliary function of type
(var -> vd -> Env.t -> Env.t) -> _ Ident.Map.t -> summary -> summary
would be handy.
Re. potential specializations, I looked at uses of Lambda.subst
, they seem to all be one of the two forms:
- variable->variable renamings
- substituting a variable with a closed term (in translclass and translmod),
or at least a term with free variables present only in the unchanged, common environment
(neither in the support of theenv
fragment, nor new variables)
This suggests that having two specialized traversals would also work -- but with more code duplication in Lambda.
@@ -263,7 +263,7 @@ and lambda_event = | |||
{ lev_loc: Location.t; | |||
lev_kind: lambda_event_kind; | |||
lev_repr: int ref option; | |||
lev_env: Env.summary } | |||
lev_env: Env.t } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this a wise choice? I thought that Env.summary
was designed precisely to be the "compact" part of the environment for use in debug information.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure where you got that from, but you probably have better sources than I do. I'm also not sure what you're worried about: we still "summarize" these environment when transforming lambda to the next AST, they are never marshalled or anything.
Considering we still have modifications to do on the environment at this point, I think it make more sense to actually modify Env.t
s than the summaries.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One wouldn't think of looking there but I got it from env.mli :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed, I hadn't looked there.
Then again, depends how you interpret it. Does "compact" refer to the marshalled representation that end up in .cmo
files (or similar), or to something else? My bet is on the former, in which case I believe my change is fine.
I update |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like your last commit, which is not terribly surprising as it is close to my substitution-and-scoping rants above. I would encourage you to rebase eventually to remove the previous, more hackish approach from the patchset.
I am still not convinced by the Env.summary => Env.t
change in lambda. I think of Env.summary
as "inspectable data" and Env.t
as "opaque stuff", so to me it seems proper to use the former as part of an intermediate representation such as Lambda.
Is there something problematic in writing an auxiliary function that takes an Env.t -> Env.t
transformation, and turns it into an Env.summary -> Env.summary
transformation? This seems easy to do from the API, is there some efficiency concern?
bytecomp/lambda.ml
Outdated
@@ -590,7 +590,7 @@ let rec make_sequence fn = function | |||
Assumes that the image of the substitution is out of reach | |||
of the bound variables of the lambda-term (no capture). *) | |||
|
|||
let rec subst s lam = | |||
let rec subst update_env s lam = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The update_env
function remains constant over the traversal, so it doesn't need to be a parameter to inner recursive calls. Could you maybe wrap the mutual recursion under a toplevel let subst update_env s lam =
function, so that the patch is less invasive and the code lighter? (You still need an indentation bump.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
bytecomp/lambda.mli
Outdated
@@ -334,7 +334,7 @@ val transl_class_path: ?loc:Location.t -> Env.t -> Path.t -> lambda | |||
|
|||
val make_sequence: ('a -> lambda) -> 'a list -> lambda | |||
|
|||
val subst: lambda Ident.Map.t -> lambda -> lambda | |||
val subst: (Ident.t -> Types.value_description -> Env.t -> Env.t) -> lambda Ident.Map.t -> lambda -> lambda |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe you could extend the comment with a word on what the new parameter does?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do once we settle on what it should actually do.
bytecomp/simplif.ml
Outdated
) | ||
let update_env idmap oldid vd env = | ||
let newid = Ident.Map.find oldid idmap in | ||
Env.add_value newid vd env |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This toplevel definition feels a bit lost, and then the use-sites duplicate idmap-building logic. Why don't you just expose a variant of subst
called rename
, here or directly in Lambda, that takes a list of (old_id, new_id)
pairs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
bytecomp/lambda.ml
Outdated
Ident.Map.fold (fun id _ env -> | ||
match Env.find_value (Path.Pident id) evt.lev_env with | ||
| exception Not_found -> env | ||
| vd -> update_env id vd env |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that in this branch update_env
should be called on a version of env
where the id
binding has been removed. Otherwise there is no way to remove identifiers from the environment when performing a substitution.
@@ -839,21 +852,22 @@ let field_of_str loc str = | |||
|
|||
|
|||
let transl_store_structure glob map prims str = | |||
let no_env_update _ _ env = env in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the implementation of substitution on environments is adapted to remove bindings before update_env
is called, then this function could/should renamed into remove_from_env
, and I believe this would be the correct behavior for the use-sites below.
And I myself remain completely unconvinced by that argument. Sorry.
There is no difficulty: But regardless of the potential cost, I just disagree with your opinion regarding the use of
Not that it should matter, but there is currently no way to remove anything from the environment, ever. |
To comment on this:
I was wondering about the impact of these substitutions (the ones done in translmod that is) on "debugability", but given that the substitutions are done then I agree a correct semantic would be to remove the bindings from the environments while performing the substitutions. I'm running a bit out of steam on this PR: I thought it would be a quick and easy (and gratifying, and whatever else you can think of!) fix that would distract me for one evening of the things I usually do; I should have known better :p Here is I believe, a summary of the state of this PR:
While I feel guilty for saying that, I don't think point 2 is a blocker and I feel like we could merge even without handling it (we're not introducing a regression, we're "only" half-fixing the issue). So it seems that we have two options here (if we don't want this PR to rot away for months):
I'd be happy with either of these two options (and somewhat sad if neither was taken...) What do you think? |
I think that your proposals for both (1) and (2) are reasonable and pragmatic. Re. (1), I agree with your point about efficiency concerns, and with the idea that consumers could always perform summarization locally. (We could think of adding a type parameter to Lambda.t that is either I would find it reassuring to have a comment from someone that knows when and how summarization really matters.) |
My intuition is that we need to summarize when we serialize (which happens for debug events, but also for cmts). I somewhat doubt that there ever was another reason, but I'm only guessing. |
I have rebased and cleaned up the history. Given that there has been no comment for the past week regarding the |
@xavierleroy: is there any reason not to change the payload of (To summarize the discussion above: the reason to use |
I dug up an old discussion on caml-devel (from 2012!) about environment summaries. The gist is that summaries are used in debugger events (in byte code executables) and The main protagonists were @lefessan and @xavierleroy. Any opinions? |
Summaries ( |
Thanks for the context! I'm still in favor of keeping the change from summaries to full env in lambda, for the same reasons as previously. That is: I had a look to see if that last argument could be used to justify pushing |
After thinking more about the issue, I have come to agree with @trefis here: having It is important that If we wanted to improve on this, several choices:
|
Slight variation: you could add a substitution constructor to the |
I'd still rather keep That said, I'm too shy to take a decision on this, so I welcome any opinion (one way or the other). |
Introduced "Lambda.rename" for var->var substitutions, this also updates the debug envs automatically. This fixes MPR#7554
Apparently given enough time, shyness goes away: I rebased this morning, the CIs appear to pass, I'll merge as is later today. |
https://caml.inria.fr/mantis/view.php?id=7554
@alainfrisch , @gasche : any opinion on the approach proposed here?
Testing
Without the changes proposed here the result is:
With the changes:
Notes about the implementation
This is a proof of concept.
Things that could/should be tried before merging:
simplify_exits
is the only place which introduces new bindings, any other place doing so should probably be updated in the same way.